62 research outputs found
Transferring Procedural Knowledge across Commonsense Tasks
Stories about everyday situations are an essential part of human
communication, motivating the need to develop AI agents that can reliably
understand these stories. Despite the long list of supervised methods for story
completion and procedural understanding, current AI has no mechanisms to
automatically track and explain procedures in unseen stories. To bridge this
gap, we study the ability of AI models to transfer procedural knowledge to
novel narrative tasks in a transparent manner. We design LEAP: a comprehensive
framework that integrates state-of-the-art modeling architectures, training
regimes, and augmentation strategies based on both natural and synthetic
stories. To address the lack of densely annotated training data, we devise a
robust automatic labeler based on few-shot prompting to enhance the augmented
data. Our experiments with in- and out-of-domain tasks reveal insights into the
interplay of different architectures, training regimes, and augmentation
strategies. LEAP's labeler has a clear positive impact on out-of-domain
datasets, while the resulting dense annotation provides native explainability.
BRAINTEASER: Lateral Thinking Puzzles for Large Language Models
The success of language models has inspired the NLP community to attend to
tasks that require implicit and complex reasoning, relying on human-like
commonsense mechanisms. While such vertical thinking tasks have been relatively
popular, lateral thinking puzzles have received little attention. To bridge
this gap, we devise BRAINTEASER: a multiple-choice Question Answering task
designed to test the model's ability to exhibit lateral thinking and defy
default commonsense associations. We design a three-step procedure for creating
the first lateral thinking benchmark, consisting of data collection, distractor
generation, and generation of adversarial examples, leading to 1,100 puzzles
with high-quality annotations. To assess the consistency of lateral reasoning
by models, we enrich BRAINTEASER based on a semantic and contextual
reconstruction of its questions. Our experiments with state-of-the-art
instruction- and commonsense language models reveal a significant gap between
human and model performance, which is further widened when consistency across
adversarial formats is considered. We make all of our code and data available
to stimulate work on developing and evaluating lateral thinking models.
Chain-of-Note: Enhancing Robustness in Retrieval-Augmented Language Models
Retrieval-augmented language models (RALMs) represent a substantial
advancement in the capabilities of large language models, notably in reducing
factual hallucination by leveraging external knowledge sources. However, the
reliability of the retrieved information is not always guaranteed. The
retrieval of irrelevant data can lead to misguided responses, potentially
causing the model to overlook its inherent knowledge, even when it possesses
adequate information to address the query. Moreover, standard RALMs often
struggle to assess whether they possess adequate knowledge, both intrinsic and
retrieved, to provide an accurate answer. In situations where knowledge is
lacking, these systems should ideally respond with "unknown" when the answer is
unattainable. In response to these challenges, we introduce Chain-of-Noting
(CoN), a novel approach aimed at improving the robustness of RALMs in facing
noisy, irrelevant documents and in handling unknown scenarios. The core idea of
CoN is to generate sequential reading notes for retrieved documents, enabling a
thorough evaluation of their relevance to the given question and integrating
this information to formulate the final answer. We used ChatGPT to create
training data for CoN and then trained a LLaMA-2 7B model on that data.
Our experiments across four open-domain QA benchmarks show that RALMs equipped
with CoN significantly outperform standard RALMs. Notably, CoN achieves an
average improvement of +7.9 in EM score given entirely noisy retrieved
documents and +10.5 in rejection rates for real-time questions that fall
outside the pre-training knowledge scope. Comment: Preprint
Knowledge-driven Data Construction for Zero-shot Evaluation in Commonsense Question Answering
Recent developments in pre-trained neural language modeling have led to leaps
in accuracy on commonsense question-answering benchmarks. However, there is
increasing concern that models overfit to specific tasks, without learning to
utilize external knowledge or perform general semantic reasoning. In contrast,
zero-shot evaluations have shown promise as a more robust measure of a model's
general reasoning abilities. In this paper, we propose a novel neuro-symbolic
framework for zero-shot question answering across commonsense tasks. Guided by
a set of hypotheses, the framework studies how to transform various
pre-existing knowledge resources into a form that is most effective for
pre-training models. We vary the set of language models, training regimes,
knowledge sources, and data generation strategies, and measure their impact
across tasks. Extending prior work, we devise and compare four constrained
distractor-sampling strategies. We provide empirical results across five
commonsense question-answering tasks with data generated from five external
knowledge resources. We show that, while an individual knowledge graph is
better suited for specific tasks, a global knowledge graph brings consistent
gains across different tasks. In addition, both preserving the structure of the
task and generating fair and informative questions help language models
learn more effectively. Comment: AAAI 202
Effect of Dextrose Equivalent on Maltodextrin/Whey Protein Spray-Dried Powder Microcapsules and Dynamic Release of Loaded Flavor during Storage and Powder Rehydration
Peer reviewed. The preparation of powdered microcapsules of flavor substances should not only protect
these substances from volatilization during storage but also improve their diffusion during use.
This study aimed to investigate the effects of maltodextrin (MD) with different dextrose equivalent
(DE) values on retention of flavor substances during storage, and the dynamic release of flavor
substances during dissolution. MDs with three different DE values and whey protein isolate were
mixed in a ratio of 4:1 as wall materials to encapsulate ethyl acetate, and powdered microcapsules
were prepared by spray drying. It was proved that MD could reduce the diffusion of flavor substances
under different relative humidity conditions through the interaction between core material and wall
material. During dissolution, MD released flavor substances quickly owing to its superior solubility.
The reconstituted emulsion formed after the powder dissolved in water recaptured flavor substances
and made the system reach equilibrium. This study explored the mechanism of flavor release during
the storage and dissolution of powder microcapsules and should help us understand the application
of powder microcapsules in food systems.
- …